    Optimizing Queries to Remote Resources

    One key property of the Semantic Web is its support for interoperability. Recent research in this area focuses on the integration of multiple data sources to facilitate tasks such as ontology learning, user query expansion and context recognition. The growing popularity of such mashups and the rising number of Web APIs supporting links between heterogeneous data providers call for intelligent methods to spare remote resources and minimize the delays imposed by queries to external data sources. This paper suggests a cost and utility model for optimizing such queries by leveraging optimal stopping theory from business economics: applications are modeled as decision makers that look for optimal answer sets. Queries to remote resources incur additional cost but retrieve valuable information which improves the estimation of the answer set's utility. Optimal stopping optimizes the trade-off between query cost and answer utility, yielding optimal query strategies for remote resources. These strategies are compared to conventional approaches in an extensive evaluation based on real-world response times taken from seven popular Web services.
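The core trade-off described above can be sketched in a few lines. This is an illustrative simplification, not the paper's model: it assumes a fixed cost per remote query and a sequence of (estimated) marginal utility gains, and applies the classic stopping rule of querying only while the next gain still covers its cost.

```python
# Hypothetical sketch of the cost/utility trade-off: keep querying remote
# sources while the expected utility gain of one more query exceeds its cost.

def optimal_stopping_queries(gains, query_cost):
    """Issue remote queries in order of expected utility gain and stop
    once the marginal gain no longer covers the per-query cost.

    gains      -- expected utility gain of each additional remote query
    query_cost -- fixed cost per query (e.g. a latency penalty)
    Returns (number of queries issued, net utility achieved).
    """
    net_utility = 0.0
    issued = 0
    for gain in gains:
        if gain <= query_cost:  # stopping rule: marginal gain too low
            break
        net_utility += gain - query_cost
        issued += 1
    return issued, net_utility
```

With gains of 5, 3, 1 and 0.5 at a query cost of 2, the rule issues the first two queries and stops, since the third query's gain (1) no longer covers its cost.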

    Slides: Context Aware Sentiment Detection

    The simplicity of using Web publishing services and social networking platforms has resulted in an abundance of user-generated content. A significant portion of this content contains user opinions with clear economic relevance - customer and travel reviews, for example, or the articles of respected bloggers who influence purchase decisions. Analyzing and acting upon user-generated content is therefore becoming imperative for marketers and social scientists who need to gather feedback from very large user communities. To identify trends in user-generated content and compare the differing perceptions of interest groups, automated sentiment detection identifies and aggregates polar opinions (i.e. positive or negative statements about facts). To achieve accurate results, sentiment detection requires a correct interpretation of natural language, which remains a challenging task due to its inherent ambiguities. Most approaches to sentiment detection are based on the notion that there is a stable conceptual connection between words and their adjacent text, but neglect the context of opinionated terms when trying to resolve ambiguities. To address this limitation, the presenters will discuss an approach based on contextualized sentiment lexicons and introduce a domain-specific method for the automated refinement of such lexicons.
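The idea of a contextualized lexicon can be illustrated with a minimal sketch. The entries and polarity values below are invented examples, not the presenters' lexicon: the point is simply that polarity is stored per (term, context) pair rather than per term, so an ambiguous term resolves differently depending on its context.

```python
# Illustrative contextualized sentiment lexicon: polarity depends on the
# (term, context) pair, with a context-free entry as fallback.
LEXICON = {
    ("cold", "beverage"): 1.0,   # "a cold beer" - positive in this context
    ("cold", "service"): -1.0,   # "cold service" - negative here
    ("cold", None): 0.0,         # neutral fallback when no context matches
}

def polarity(term, context=None):
    """Look up a term's polarity for the given context, falling back to
    the context-free lexicon entry when the pair is unknown."""
    return LEXICON.get((term, context), LEXICON.get((term, None), 0.0))
```

A plain term-to-polarity lexicon would have to pick one fixed value for "cold"; the contextualized variant keeps both readings and defers the decision to lookup time.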

    A Utility Centered Approach for Evaluating and Optimizing Geo-Tagging

    Geo-tagging is the process of annotating a document with its geographic focus by extracting a unique locality that describes the geographic context of the document as a whole. Accurate geographic annotations are crucial for geospatial applications such as Google Maps or the IDIOM Media Watch on Climate Change, but many obstacles complicate the evaluation of such tags. This paper introduces an approach for optimizing geo-tagging by applying the concept of utility from economic theory to tagging results. Computing utility scores for geo-tags allows a fine-grained evaluation of the tagger's performance with regard to multiple dimensions specified in use-case-specific domain ontologies, and provides means for addressing problems such as the differing scope and coverage of evaluation corpora. The integration of external data sources and evaluation ontologies with user profiles ensures that the framework considers use-case-specific requirements. The presented model is instrumental in comparing different geo-tagging settings, evaluating the effect of design decisions, and customizing geo-tagging to particular use cases.
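A multi-dimensional utility score of this kind can be sketched as a weighted aggregation. The dimension names and weights below are hypothetical placeholders; in the paper they would come from the use-case-specific domain ontologies and user profiles.

```python
# Hedged sketch: aggregate per-dimension evaluation scores for a geo-tag
# into a single utility value via a normalized weighted sum. Dimensions
# and weights are invented for illustration.

def geo_tag_utility(scores, weights):
    """Combine per-dimension scores (each in [0, 1]) into one utility value.

    scores  -- dict mapping dimension name to the tag's score there
    weights -- dict mapping dimension name to its use-case-specific weight
    """
    total_weight = sum(weights.values())
    return sum(scores[dim] * w for dim, w in weights.items()) / total_weight
```

Because the weights are external to the scoring code, the same tagger output can yield different utilities for different use cases, which is what makes side-by-side comparison of tagging settings possible.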

    Name Variants for Improving Entity Discovery and Linking

    Identifying all names that refer to a particular set of named entities is a challenging task, as we often need to consider many sources of variation, including abbreviations, aliases, hypocorisms, multilingualism and partial matches. Each entity type can also have specific rules for name variants: people's names can include titles, country and branch names are sometimes removed from organization names, while locations are often plagued by the issue of nested entities. The lack of a clear strategy for collecting, processing and computing name variants significantly lowers the recall of tasks such as Named Entity Linking and Knowledge Base Population, since name variants are frequently used in all kinds of textual content. This paper proposes several strategies to address these issues. Recall can be improved by combining knowledge repositories and by computing additional variants based on algorithmic approaches. Heuristics and machine learning methods then analyze the generated name variants and mark ambiguous names to increase precision. An extensive evaluation demonstrates the effects of integrating these methods into a new Named Entity Linking framework and confirms that systematically considering name variants yields significant performance improvements.
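The entity-type-specific rules mentioned above can be sketched as simple variant generators. The title and legal-suffix lists below are small invented examples, not the paper's rule set:

```python
# Illustrative rule-based name-variant generation per entity type.
# TITLES and ORG_SUFFIXES are hypothetical examples for demonstration.
TITLES = {"dr.", "prof.", "mr.", "ms."}
ORG_SUFFIXES = {"inc.", "ltd.", "gmbh"}

def name_variants(name, entity_type):
    """Generate candidate surface forms for an entity name by stripping
    titles (persons) or legal-form suffixes (organizations), plus an
    acronym built from the capitalized tokens."""
    tokens = name.split()
    variants = {name}
    if entity_type == "person" and tokens and tokens[0].lower() in TITLES:
        variants.add(" ".join(tokens[1:]))       # drop the leading title
    if entity_type == "organization" and tokens and tokens[-1].lower() in ORG_SUFFIXES:
        variants.add(" ".join(tokens[:-1]))      # drop the legal suffix
    initials = [t[0] for t in tokens if t[0].isupper()]
    if len(initials) > 1:
        variants.add("".join(initials))          # acronym variant
    return variants
```

In a full pipeline, variants generated this way would be merged with those drawn from knowledge repositories, and ambiguous candidates would then be filtered by the heuristic and machine learning steps the abstract describes.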